Double complex convolution and attention aggregating recurrent network for speech enhancement
Bennian YU, Yongzhao ZHAN, Qirong MAO, Wenlong DONG, Honglin LIU
Journal of Computer Applications    2023, 43 (10): 3217-3224.   DOI: 10.11772/j.issn.1001-9081.2022101533

Aiming at the problems of limited representation of spectrogram feature correlation information and unsatisfactory denoising in existing speech enhancement methods, a speech enhancement method based on a Double Complex Convolution and Attention Aggregating Recurrent Network (DCCARN) was proposed. Firstly, a double complex convolutional network was established to encode the two branches of the spectrogram features obtained by the Short-Time Fourier Transform (STFT). Secondly, the codes of the two branches were fed into inter- and intra-feature-block attention mechanisms respectively, and different speech feature information was re-labeled. Thirdly, the long-term sequence information was processed by a Long Short-Term Memory (LSTM) network, and the spectrogram features were restored and aggregated by two decoders. Finally, the target speech waveform was generated by the inverse STFT, thereby suppressing the noise. Experiments were carried out on the public dataset VBD (VoiceBank+DEMAND) and a noise-added TIMIT dataset. The results show that, compared with the phase-aware Deep Complex Convolution Recurrent Network (DCCRN), DCCARN improves the Perceptual Evaluation of Speech Quality (PESQ) score by 0.150 on VBD and by 0.077 to 0.087 on TIMIT. These results verify that the proposed method captures the correlation information of spectrogram features more accurately, suppresses noise more effectively, and improves speech intelligibility.
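The "double complex convolution" encoders above operate on the real and imaginary branches of the STFT spectrogram jointly. A minimal sketch of the underlying complex convolution rule is shown below; this is an illustrative 1-D numpy version following the standard identity (Wr + jWi)(Xr + jXi) = (WrXr − WiXi) + j(WrXi + WiXr), not the authors' network implementation (which uses 2-D complex convolution layers over time-frequency features).

```python
import numpy as np

def complex_conv1d(x_r, x_i, w_r, w_i):
    """Complex-valued convolution realized with four real convolutions.

    The real and imaginary parts of input X and kernel W are combined as
      real part: Wr*Xr - Wi*Xi
      imag part: Wr*Xi + Wi*Xr
    which is the rule complex convolutional speech-enhancement
    networks apply to the two STFT branches.
    """
    y_r = np.convolve(x_r, w_r, mode="same") - np.convolve(x_i, w_i, mode="same")
    y_i = np.convolve(x_r, w_i, mode="same") + np.convolve(x_i, w_r, mode="same")
    return y_r, y_i
```

The two-real-branch form is equivalent to convolving the complex signal with the complex kernel directly, which is easy to check against numpy's native complex `np.convolve`.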
